Self-Tuning Parameters for Decision Tree Algorithm Based on Big Data Analytics
Authors
Abstract
Big data is usually unstructured, and many applications require it to be analyzed in real time. The decision tree (DT) algorithm is widely used to analyze big data. Selecting the optimal depth of a DT is a time-consuming process, as it requires many iterations. In this paper, we have designed a modified version of the DT. The modified tree aims to achieve the optimal depth by self-tuning its running parameters, thereby improving accuracy. The efficiency of the modified DT was verified using two datasets (airport and fire datasets). The airport dataset has 500,000 instances and the fire dataset has 600,000 instances. A comparison has been made between the modified DT and the standard DT, with results showing that the modified DT performs better. This comparison was conducted on a multi-node cluster using the Apache Spark tool on Amazon Web Services. The resulting accuracy shows an increase of 6.85% for the first dataset and 8.85% for the other. In conclusion, the modified DT showed better accuracy in handling different-sized datasets compared to the standard DT algorithm.
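The abstract notes that choosing a tree depth normally takes repeated training runs. The following is a minimal sketch of one way such a depth search could be automated with early stopping. It is not the authors' algorithm: `tune_depth`, `evaluate`, `patience`, `min_gain`, and `toy_accuracy` are hypothetical names introduced for illustration, and `evaluate(depth)` stands in for training a tree at that depth (for example on Spark) and returning its validation accuracy.

```python
from typing import Callable, Tuple


def tune_depth(
    evaluate: Callable[[int], float],
    max_depth: int = 30,
    patience: int = 2,
    min_gain: float = 1e-3,
) -> Tuple[int, float]:
    """Sweep depths 1..max_depth and stop once accuracy stops improving.

    `evaluate` is a hypothetical callback: it should train a tree with the
    given depth and return validation accuracy in [0, 1].
    """
    if max_depth < 1:
        raise ValueError("max_depth must be at least 1")

    best_depth, best_acc = 1, evaluate(1)
    stalled = 0  # consecutive depths without a meaningful improvement

    for depth in range(2, max_depth + 1):
        acc = evaluate(depth)
        if acc > best_acc + min_gain:
            best_depth, best_acc = depth, acc
            stalled = 0
        else:
            stalled += 1
            if stalled >= patience:
                break  # deeper trees are no longer helping

    return best_depth, best_acc


def toy_accuracy(depth: int) -> float:
    """Toy stand-in for a real training run (assumption, not real data):
    accuracy rises until depth 6, then slowly degrades."""
    return 0.60 + 0.05 * min(depth, 6) - 0.01 * max(depth - 6, 0)


if __name__ == "__main__":
    depth, acc = tune_depth(toy_accuracy)
    print(f"selected depth={depth}, validation accuracy={acc:.2f}")
```

With the toy curve, the sweep stops after two non-improving depths instead of training all 30 candidates; in practice `evaluate` would wrap a real training and validation step.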
Similar resources
Starfish: A Self-tuning System for Big Data Analytics
Modern industrial, government, and academic organizations are collecting massive amounts of data (“big data”) at an unprecedented scale and pace. The ability to perform timely and cost-effective analytical processing of such large datasets to extract deep insights is now a key ingredient for success. These insights can drive automated processes for advertisement placement, improve customer relat...
A Fuzzy TOPSIS Approach for Big Data Analytics Platform Selection
Big data sizes are constantly increasing. Big data analytics is where advanced analytic techniques are applied on big data sets. Analytics based on large data samples reveals and leverages business change. The popularity of big data analytics platforms, which are often available as open-source, has not remained unnoticed by big companies. Google uses MapReduce for PageRank and inverted indexes....
Using 'Big Data' for analytics and decision support
People and the computers they use are generating large amounts of varied data. The phenomenon of capturing and trying to use all of the semi-structured and unstructured data has been called by vendors and bloggers "Big Data". Organizations can capture and store data of many types from almost any source, but capturing and storing data only adds value when it has a useful purpose. Big Data must b...
A New Algorithm for Optimization of Fuzzy Decision Tree in Data Mining
Decision-tree algorithms provide one of the most popular methodologies for symbolic knowledge acquisition. The resulting knowledge, a symbolic decision tree along with a simple inference mechanism, has been praised for comprehensibility. The most comprehensible decision trees have been designed for perfect symbolic data. Classical crisp decision trees (DT) are widely applied to classification t...
Application of Big Data Analytics in Power Distribution Network
Smart grid enhances optimization in generation, distribution and consumption of the electricity by integrating information and communication technologies into the grid. Today, utilities are moving towards smart grid applications, most common one being deployment of smart meters in advanced metering infrastructure, and the first technical challenge they face is the huge volume of data generated ...
Journal
Journal title: Computers, Materials & Continua
Year: 2023
ISSN: 1546-2218, 1546-2226
DOI: https://doi.org/10.32604/cmc.2023.034078